1.
Nat Biomed Eng ; 7(7): 853-866, 2023 07.
Article in English | MEDLINE | ID: mdl-36536253

ABSTRACT

Variant callers typically produce massive numbers of false positives for structural variations, such as cancer-relevant copy-number alterations and fusion genes resulting from genome rearrangements. Here we describe an ultrafast and accurate detector of somatic structural variations that reduces read-mapping costs by filtering out reads matched to pan-genome k-mer sets. The detector, which we named ETCHING (for efficient detection of chromosomal rearrangements and fusion genes), reduces the number of false positives by leveraging machine-learning classifiers trained with six breakend-related features (clipped-read count, split-read count, supporting paired-end read count, average mapping quality, depth difference and total length of clipped bases). When benchmarked against six callers on reference cell-free DNA, validated biomarkers of structural variants, matched tumour and normal whole genomes, and tumour-only targeted sequencing datasets, ETCHING was 11-fold faster than the second-fastest structural-variant caller at comparable performance and memory use. The speed and accuracy of ETCHING may aid large-scale genome projects and facilitate practical implementations in precision medicine.
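
The read-filtering idea in the abstract above can be illustrated with a minimal sketch. This is not ETCHING's implementation: the function names, the exact-match rule, and the toy sequences are assumptions made for illustration, and the sketch ignores reverse complements and sequencing errors. It keeps only reads that contain at least one k-mer absent from a reference k-mer set, so that only those reads would need to be mapped.

```python
from typing import Iterable, Iterator, List, Set


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq (nothing if seq is shorter than k)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(references: Iterable[str], k: int) -> Set[str]:
    """Collect all k-mers seen in the reference sequences (hypothetical helper)."""
    kmer_set: Set[str] = set()
    for ref in references:
        kmer_set.update(kmers(ref, k))
    return kmer_set


def keep_candidate_reads(reads: Iterable[str], reference_kmers: Set[str], k: int) -> List[str]:
    """Keep reads with at least one k-mer that is absent from the reference set.

    Reads whose k-mers all match the reference are dropped, since they carry
    no evidence of a novel junction. Reads shorter than k yield no k-mers and
    are dropped as well.
    """
    return [
        read
        for read in reads
        if any(km not in reference_kmers for km in kmers(read, k))
    ]


reference_kmers = build_kmer_set(["ACGTACGTAC"], k=4)
candidates = keep_candidate_reads(["ACGTAC", "ACGTTT"], reference_kmers, k=4)
```

In this toy input, the first read is fully covered by reference k-mers and is filtered out, while the second contains k-mers not present in the reference and is kept.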


Subjects
High-Throughput Nucleotide Sequencing; Neoplasms; Humans; High-Throughput Nucleotide Sequencing/methods; Genome; Sequence Analysis, DNA/methods
2.
Comput Struct Biotechnol J ; 18: 814-820, 2020.
Article in English | MEDLINE | ID: mdl-32308928

ABSTRACT

The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas systems, including dead Cas9 (dCas9), Cas9, and Cas12a, have revolutionized genome engineering in mammalian somatic cells. Although computational tools that assess the target sites of CRISPR-Cas systems are essential for designing efficient guide RNAs (gRNAs), they exhibit generalization issues in selecting features and do not provide optimal results in a comprehensive manner. Here, we introduce a Comprehensive Guide Designer (CGD) for four different CRISPR systems, which uses the machine learning algorithm Elastic Net Logistic Regression (ENLOR) to autonomously generalize the models. CGD contains specific models trained with public datasets generated by CRISPRi, CRISPRa, CRISPR-Cas9, and CRISPR-Cas12a (designated as CGDi, CGDa, CGD9, and CGD12a, respectively) in an unbiased manner. The trained CGD models were benchmarked against other regression-based machine learning models, such as ElasticNet Linear Regression (ENLR), Random Forest and Boruta (RFB), and Extreme Gradient Boosting (Xgboost) with inbuilt feature selection. Evaluation with independent test datasets showed that CGD models outperformed the pre-existing methods in predicting the efficacy of gRNAs. All CGD source code and datasets are available at GitHub (https://github.com/vipinmenon1989/CGD), and the CGD webserver can be accessed at http://big.hanyang.ac.kr:2195/CGD.
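
For readers unfamiliar with elastic net logistic regression, the standard textbook objective is given here as a reference point. The abstract does not state the parameterization used by CGD, so the symbols (the overall penalty strength \lambda, the mixing weight \alpha, the feature vector x_i and the binary label y_i) are assumptions for illustration only.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i \log p_i + (1-y_i)\log(1-p_i)\Bigr]
  + \lambda\Bigl[\alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr],
\qquad
p_i = \frac{1}{1+\exp\bigl(-(\beta_0 + x_i^{\top}\beta)\bigr)}
```

Setting \alpha = 1 recovers the lasso penalty, which drives some coefficients exactly to zero and so acts as feature selection; setting \alpha = 0 recovers ridge regression, which shrinks coefficients without removing features.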

3.
Gigascience ; 7(7)2018 07 01.
Article in English | MEDLINE | ID: mdl-30010758

ABSTRACT

Background: Yeonsan Ogye (YO), an indigenous Korean chicken breed (Gallus gallus domesticus), has entirely black external features and internal organs. In this study, the draft genome of YO was assembled using a hybrid de novo assembly method that takes advantage of high-depth Illumina short reads (376.6X) and low-depth Pacific Biosciences (PacBio) long reads (9.7X). Findings: The contig and scaffold NG50s of the hybrid de novo assembly were 362.3 Kbp and 16.8 Mbp, respectively. The completeness (97.6%) of the draft genome (Ogye_1.1) was evaluated with single-copy orthologous genes using Benchmarking Universal Single-Copy Orthologs and found to be comparable to the current chicken reference genome (galGal5; 97.4%; contigs were assembled with high-depth PacBio long reads (50X) and scaffolded with short reads) and superior to other avian genomes (92%-93%; assembled with short read-only or hybrid methods). Compared to galGal4 and galGal5, the draft genome included 551 structural variations including the fibromelanosis (FM) locus duplication, related to hyperpigmentation. To comprehensively reconstruct transcriptome maps, RNA sequencing and reduced representation bisulfite sequencing data were analyzed from 20 tissues, including 4 black tissues (skin, shank, comb, and fascia). The maps included 15,766 protein-coding and 6,900 long noncoding RNA genes, many of which were tissue-specifically expressed and displayed tissue-specific DNA methylation patterns in the promoter regions. Conclusions: We expect that the resulting genome sequence and transcriptome maps will be valuable resources for studying domestic chicken breeds, including black-skinned chickens, as well as for understanding genomic differences between breeds and the evolution of hyperpigmented chickens and functional elements related to hyperpigmentation.
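
The NG50 statistic quoted in the abstract above can be computed as in the following sketch. The function names and the toy lengths are illustrative assumptions, not taken from the paper. N50 uses the total assembly length as the denominator, whereas NG50 uses an (estimated) genome size, which makes assemblies of different total length comparable.

```python
from typing import Sequence


def nx_length(lengths: Sequence[int], total: int, fraction: float = 0.5) -> int:
    """Return the largest length L such that sequences of length >= L
    together cover at least `fraction` of `total`; return 0 if the
    sequences never reach that coverage."""
    target = total * fraction
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return 0


def n50(lengths: Sequence[int]) -> int:
    """N50: the threshold is half of the assembly's own total length."""
    return nx_length(lengths, sum(lengths))


def ng50(lengths: Sequence[int], genome_size: int) -> int:
    """NG50: the threshold is half of the supplied genome size."""
    return nx_length(lengths, genome_size)


contig_lengths = [80, 70, 50, 40, 30, 20]      # toy values
assembly_n50 = n50(contig_lengths)             # half of 290 is reached at the 70 contig
assembly_ng50 = ng50(contig_lengths, 400)      # half of 400 is reached at the 50 contig
```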


Subjects
Chickens/genetics; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Animals; Cluster Analysis; Contig Mapping; CpG Islands; DNA Methylation; Gene Expression Profiling; Genome; INDEL Mutation; Polymorphism, Single Nucleotide; Promoter Regions, Genetic; RNA, Long Noncoding/genetics; Sequence Analysis, RNA; Transcriptome
4.
Brief Bioinform ; 19(1): 23-40, 2018 01 01.
Article in English | MEDLINE | ID: mdl-27742661

ABSTRACT

Since the advent of next-generation sequencing (NGS) technology, various de novo assembly algorithms based on the de Bruijn graph have been developed to construct chromosome-level sequences. However, numerous technical or computational challenges in de novo assembly still remain, although many bright ideas and heuristics have been suggested to tackle the challenges in both experimental and computational settings. In this review, we categorize de novo assemblers on the basis of the type of de Bruijn graphs (Hamiltonian and Eulerian) and discuss the challenges of de novo assembly for short NGS reads regarding computational complexity and assembly ambiguity. Then, we discuss how the limitations of the short reads can be overcome by using a single-molecule sequencing platform that generates long reads of up to several kilobases. In fact, the long read assembly has caused a paradigm shift in whole-genome assembly in terms of algorithms and supporting steps. We also summarize (i) hybrid assemblies using both short and long reads and (ii) overlap-based assemblies for long reads and discuss their challenges and future prospects. This review provides guidelines to determine the optimal approach for a given input data type, computational budget or genome.
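
As a small illustration of the Eulerian formulation mentioned in the abstract above, the sketch below builds a de Bruijn graph in which nodes are (k-1)-mers and each distinct k-mer contributes one directed edge from its prefix to its suffix. In the Hamiltonian formulation, by contrast, the k-mers themselves are the nodes. The function name and toy reads are assumptions for illustration; real assemblers also handle reverse complements, k-mer multiplicity and error correction.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def de_bruijn_graph(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Build an edge-centric (Eulerian) de Bruijn graph.

    Each distinct k-mer found in the reads becomes one edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix.
    """
    distinct_kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            distinct_kmers.add(read[i:i + k])

    graph: Dict[str, List[str]] = defaultdict(list)
    for kmer in sorted(distinct_kmers):  # sorted for a deterministic edge order
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


graph = de_bruijn_graph(["ACGT", "CGTA"], k=3)
```

For these two overlapping reads, the graph is a single path AC -> CG -> GT -> TA, and walking it spells the merged sequence ACGTA.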


Subjects
Genome, Human; High-Throughput Nucleotide Sequencing/methods; Whole Genome Sequencing/methods; Algorithms; Genomics; Humans; Software
5.
Phys Rev E ; 93: 042121, 2016 04.
Article in English | MEDLINE | ID: mdl-27176268

ABSTRACT

We study coarse-grained entropy production in an asymmetric random walk system on a periodic one-dimensional lattice. In coarse-grained systems, the original dynamics are unavoidably destroyed, but the coarse-grained entropy production is not hidden below the critical time-scale separation. The hidden entropy production increases rapidly near the critical time-scale separation.
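
As background for the quantity discussed above, the steady-state entropy production rate of the simplest asymmetric random walk on a ring can be written down in closed form. The abstract does not specify the paper's model or its coarse-graining procedure, so the setup here is an assumption: a continuous-time walker on N sites with a uniform forward hopping rate p and backward rate q, in units where k_B = 1.

```latex
% Uniform steady state: \pi_i = 1/N on every site.
% Net probability current across each bond, and the affinity per hop:
J = \frac{p - q}{N}, \qquad A = \ln\frac{p}{q}
% Summing J A over the N bonds gives the total entropy production rate:
\sigma = \sum_{i=1}^{N} J\,A = (p - q)\,\ln\frac{p}{q} \;\ge\; 0
```

The rate vanishes only for p = q, the symmetric walk, where detailed balance holds.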

6.
Phys Rev E Stat Nonlin Soft Matter Phys ; 75(6 Pt 1): 061110, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17677223

ABSTRACT

We study the critical properties of the majority voter model by using two different transition rates: the Glauber rate and the Metropolis rate. The model with the Glauber rate has been found to be mapped to the majority voter model with noise [de Oliveira, J. Stat. Phys. 66, 273 (1992)]. The critical temperature and the critical exponents for the two transition rates are obtained from a Monte Carlo simulation with a finite size scaling analysis. The critical temperature is found to depend on the transition rate, but the critical exponents do not. The values of the critical exponents obtained indicate that the model belongs to the same universality class as the Ising model, regardless of the type of transition rate.
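
The two transition rates named in the abstract above have standard textbook forms for a move that changes the energy by delta_e at a given temperature, sketched below with k_B = 1. How the paper defines the energy change for the majority voter model is not stated in the abstract, so delta_e is treated here as a generic input and the function names are illustrative assumptions.

```python
import math


def glauber_rate(delta_e: float, temperature: float) -> float:
    """Glauber (heat-bath) flip probability: 1 / (1 + exp(delta_e / T)).

    Written in a numerically stable form so that large |delta_e / T|
    cannot overflow math.exp.
    """
    x = delta_e / temperature
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def metropolis_rate(delta_e: float, temperature: float) -> float:
    """Metropolis flip probability: min(1, exp(-delta_e / T))."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)
```

Both rates satisfy detailed balance, since the ratio of the rate for delta_e to the rate for -delta_e equals exp(-delta_e / T) in each case. They differ in magnitude, however: at delta_e = 0 the Glauber rate is 0.5 while the Metropolis rate is 1.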

7.
Phys Rev E Stat Nonlin Soft Matter Phys ; 75(6 Pt 1): 061130, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17677243

ABSTRACT

Applying the histogram reweighting method, we investigate the critical behavior of the XY model on growing scale-free networks with various degree exponents $\lambda$. For $\lambda \le 3$, the critical temperature diverges as it does for the Ising model on scale-free networks. For $\lambda = 8$, on the other hand, we observe a second-order phase transition at finite temperature. We obtain the critical temperature $T_c = 3.08(2)$ and the critical exponents $\nu = 2.62(3)$, $\gamma/\nu = 0.127(4)$, and $\beta/\nu = 0.442(2)$ from a finite-size scaling analysis.
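
The histogram reweighting method named in the abstract above can be sketched in its single-histogram form: samples drawn at one inverse temperature are reweighted by exp(-(beta_new - beta_sim) * E) to estimate an average at a nearby inverse temperature. The function name, argument names and toy data below are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
import math
from typing import Sequence


def reweight(energies: Sequence[float], observable: Sequence[float],
             beta_sim: float, beta_new: float) -> float:
    """Estimate the mean of `observable` at beta_new from samples taken at beta_sim.

    Each sample gets weight exp(-(beta_new - beta_sim) * E). The largest
    log-weight is subtracted before exponentiating to avoid overflow; the
    shift cancels in the ratio.
    """
    if len(energies) != len(observable) or not energies:
        raise ValueError("energies and observable must be non-empty and equal in length")
    delta = beta_new - beta_sim
    log_weights = [-delta * e for e in energies]
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    return sum(w * a for w, a in zip(weights, observable)) / sum(weights)


# Toy data: two samples with energies 0 and 1; the observable is the energy itself.
same_beta = reweight([0.0, 1.0], [0.0, 1.0], beta_sim=1.0, beta_new=1.0)
colder = reweight([0.0, 1.0], [0.0, 1.0], beta_sim=1.0, beta_new=2.0)
```

With beta_new equal to beta_sim the estimate reduces to the plain sample mean. The estimate is reliable only while the energy histogram sampled at beta_sim overlaps well with the one expected at beta_new.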

...